# Large-scale visual feature extraction
Vit Huge Patch14 Clip Quickgelu 378.dfn5b
Other
ViT-Huge image encoder based on the CLIP architecture, trained on the DFN5B dataset, using the QuickGELU activation
Image Classification
Transformers

timm
27
0
Vit Huge Patch14 Clip 378.dfn5b
Other
The visual encoder of DFN5B-CLIP, a ViT-Huge model trained on 378x378 images
Image Classification
Transformers

timm
461
0
Convnext Xxlarge.clip Laion2b Soup
Apache-2.0
ConvNeXt-XXLarge image encoder trained by LAION within the CLIP framework, suitable for multimodal tasks
Image Classification
Transformers

timm
220
0
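
The encoders listed above are published under the timm organization, so a feature-extraction workflow might look like the sketch below. It rests on two assumptions that the listing itself does not confirm: that each display name maps to a timm identifier by lowercasing it and replacing spaces with underscores, and that pooled features are wanted (hence `num_classes=0`). The helper names `to_timm_name`, `patch_grid`, and `load_encoder` are hypothetical, introduced only for illustration.

```python
def to_timm_name(listing_name: str) -> str:
    """Turn a display name from the listing into a candidate timm identifier.

    Assumption: the identifier is the display name lowercased, with spaces
    replaced by underscores. Verify the result against the model card.
    """
    return "_".join(listing_name.lower().split())


def patch_grid(image_size: int, patch_size: int) -> int:
    """Number of patches along one side for a square ViT input."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    return image_size // patch_size


def load_encoder(listing_name: str, pretrained: bool = True):
    """Build the encoder as a pooled feature extractor (no classifier head).

    timm is imported lazily so the helpers above work without it installed.
    With pretrained=True this downloads the weights, which are very large
    for the models in this listing.
    """
    import timm

    return timm.create_model(
        to_timm_name(listing_name),
        pretrained=pretrained,
        num_classes=0,  # drop the classification head, keep pooled features
    )


# A 378x378 input split into 14x14 patches gives a 27x27 grid of patches.
side = patch_grid(378, 14)
num_patches = side * side
```

Calling `load_encoder("Vit Huge Patch14 Clip 378.dfn5b")` would then request `vit_huge_patch14_clip_378.dfn5b` from timm; if the name-mapping assumption does not hold for a given entry, the identifier on its model card should be passed to `timm.create_model` directly.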